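Simulated DAG models may exhibit properties that, perhaps inadvertently, render their structure identifiable and unexpectedly affect structure learning algorithms. Here, we show that marginal variance tends to increase along the causal order for generically sampled additive noise models. We introduce varsortability as a measure of the agreement between the order of increasing marginal variance and the causal order. For commonly sampled graphs and model parameters, we show that the remarkable performance of some continuous structure learning algorithms can be explained by high varsortability and matched by a simple baseline method. Yet, this performance may not transfer to real-world data, where varsortability may be moderate or may depend on the choice of measurement scales. On standardized data, the same algorithms fail to identify the ground-truth DAG or its Markov equivalence class. While standardization removes the pattern in marginal variance, we show that data-generating processes that incur high varsortability also leave a distinct covariance pattern that may be exploited even after standardization. Our findings challenge the significance of generic benchmarks with independently drawn parameters. The code is available at https://github.com/scriddie/varsortable.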
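The variance pattern described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it samples a linear additive noise chain with hypothetical edge weights and unit Gaussian noise, and then computes a simplified, edge-based proxy for varsortability (the fraction of edges whose child has larger sample variance than its parent). The function names `simulate_chain` and `edge_variance_agreement` and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def simulate_chain(weights, n_samples, seed=0):
    """Sample a linear additive noise chain X0 -> X1 -> ... with unit Gaussian noise."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)]]
    for w in weights:
        parent = columns[-1]
        # Each child is a weighted copy of its parent plus independent noise.
        columns.append([w * x + rng.gauss(0.0, 1.0) for x in parent])
    return columns


def edge_variance_agreement(columns, edges):
    """Fraction of edges (i, j) whose child j has larger sample variance than parent i.

    Simplified proxy only: it looks at direct edges, not at all directed paths.
    """
    variances = [statistics.pvariance(col) for col in columns]
    hits = sum(1 for i, j in edges if variances[i] < variances[j])
    return hits / len(edges)


# Hypothetical weights with magnitude above 1, so variance accumulates downstream.
columns = simulate_chain(weights=[1.5, -2.0, 1.2], n_samples=2000)
edges = [(0, 1), (1, 2), (2, 3)]
score = edge_variance_agreement(columns, edges)

# Sorting variables by sample variance, the idea behind a variance-sorting baseline.
variance_order = sorted(range(len(columns)), key=lambda k: statistics.pvariance(columns[k]))
print(score, variance_order)
```

In this toy chain, sorting by variance recovers the causal order because each edge weight has magnitude of at least 1; rescaling any single variable would change the variance order without changing the causal structure, which is the scale dependence the abstract warns about.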
Nucleolar organizer regions (NORs) are parts of the DNA that are involved in RNA transcription. Due to the silver affinity of associated proteins, argyrophilic NORs (AgNORs) can be visualized using silver-based staining. The average number of AgNORs per nucleus has been shown to be a prognostic factor for predicting the outcome of many tumors. Since manual detection of AgNORs is laborious, automation is of high interest. We present a deep learning-based pipeline for automatically determining the AgNOR score from histopathological sections. An additional annotation experiment was conducted with six pathologists to provide an independent performance evaluation of our approach. Across all raters and images, we found a mean squared error of 0.054 between the AgNOR scores of the experts and those of the model, indicating that our approach offers performance comparable to humans.
Mitotic activity is key for the assessment of malignancy in many tumors. Moreover, it has been demonstrated that the proportion of abnormal mitosis to normal mitosis is of prognostic significance. Atypical mitotic figures (MF) can be identified morphologically as having segregation abnormalities of the chromatids. In this work, we perform, for the first time, automatic subtyping of mitotic figures into normal and atypical categories according to characteristic morphological appearances of the different phases of mitosis. Using the publicly available MIDOG21 and TUPAC16 breast cancer mitosis datasets, two experts blindly subtyped mitotic figures into five morphological categories. Further, we set up a state-of-the-art object detection pipeline extending the anchor-free FCOS approach with a gated hierarchical subclassification branch. Our labeling experiment indicated that subtyping of mitotic figures is a challenging task and prone to inter-rater disagreement, which we found in 24.89% of MF. Using the more diverse MIDOG21 dataset for training and TUPAC16 for testing, we reached a mean overall average precision score of 0.552, a ROC AUC score of 0.833 for atypical/normal MF and a mean class-averaged ROC-AUC score of 0.977 for discriminating the different phases of cells undergoing mitosis.
Capturing large fields of view with only one camera is an important aspect in surveillance and automotive applications, but the wide-angle fisheye imagery thus obtained exhibits very special characteristics that may not be well suited for typical image and video processing methods such as motion estimation. This paper introduces a motion estimation method that adapts to the typical radial characteristics of fisheye video sequences by making use of an equisolid re-projection after moving part of the motion vector search into the perspective domain via a corresponding back-projection. By combining this approach with conventional translational motion estimation and compensation, average gains in luminance PSNR of up to 1.14 dB are achieved for synthetic fisheye sequences and up to 0.96 dB for real-world data. Maximum gains for selected frame pairs amount to 2.40 dB and 1.39 dB for synthetic and real-world data, respectively.
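The back-projection and re-projection steps mentioned in this abstract can be sketched with the standard projection formulas: an equisolid fisheye maps an incident angle θ to the radius r = 2f·sin(θ/2), while a perspective (rectilinear) projection maps it to r = f·tan(θ). The sketch below is a one-dimensional, radial-only illustration and not the paper's method, which operates on two-dimensional motion vectors. The function names are hypothetical, and using the same focal length `focal_length` in both domains is an assumption.

```python
import math


def fisheye_to_perspective(r_fisheye, focal_length):
    """Back-project an equisolid radius r = 2 f sin(theta / 2) to r = f tan(theta)."""
    theta = 2.0 * math.asin(r_fisheye / (2.0 * focal_length))
    if theta >= math.pi / 2:
        raise ValueError("incident angles of 90 degrees or more have no perspective image")
    return focal_length * math.tan(theta)


def perspective_to_fisheye(r_perspective, focal_length):
    """Re-project a perspective radius back to the equisolid fisheye radius."""
    theta = math.atan(r_perspective / focal_length)
    return 2.0 * focal_length * math.sin(theta / 2.0)


def shift_in_perspective_domain(r_fisheye, shift, focal_length):
    """Apply a radial translation in the perspective domain, then return to the fisheye image."""
    r_perspective = fisheye_to_perspective(r_fisheye, focal_length)
    return perspective_to_fisheye(r_perspective + shift, focal_length)


# Hypothetical values: unit focal length, a point at radius 0.5 in the fisheye image.
moved = shift_in_perspective_domain(r_fisheye=0.5, shift=0.1, focal_length=1.0)
print(moved - 0.5)  # the same translation appears compressed in the fisheye image
```

The usage line shows why a purely translational search in the fisheye image is a poor fit: a constant shift in the perspective domain corresponds to a displacement in the fisheye image that shrinks with increasing distance from the image center.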
Computer-aided systems in histopathology are often challenged by various sources of domain shift that impact the performance of these algorithms considerably. We investigated the potential of using self-supervised pre-training to overcome scanner-induced domain shifts for the downstream task of tumor segmentation. For this, we present the Barlow Triplets to learn scanner-invariant representations from a multi-scanner dataset with local image correspondences. We show that self-supervised pre-training successfully aligned different scanner representations, which, interestingly, only results in a limited benefit for our downstream task. We thereby provide insights into the influence of scanner characteristics for downstream applications and contribute to a better understanding of why established self-supervised methods have not yet shown the same success on histopathology data as they have for natural images.
The pre-training of masked language models (MLMs) consumes massive computation to achieve good results on downstream NLP tasks, resulting in a large carbon footprint. In the vanilla MLM, the virtual tokens, [MASK]s, act as placeholders and gather the contextualized information from unmasked tokens to restore the corrupted information. It raises the question of whether we can append [MASK]s at a later layer, to reduce the sequence length for earlier layers and make the pre-training more efficient. We show: (1) [MASK]s can indeed be appended at a later layer, being disentangled from the word embedding; (2) The gathering of contextualized information from unmasked tokens can be conducted with a few layers. By further increasing the masking rate from 15% to 50%, we can pre-train RoBERTa-base and RoBERTa-large from scratch with only 78% and 68% of the original computational budget without any degradation on the GLUE benchmark. When pre-training with the original budget, our method outperforms RoBERTa for 6 out of 8 GLUE tasks, on average by 0.4%.
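The effect of appending [MASK]s at a later layer on sequence length can be made concrete with a small token count. The sketch below only counts how many tokens each layer would process if masked positions are dropped from the input and re-inserted as placeholders at a chosen layer; it does not model attention cost and is not meant to reproduce the budgets reported in the abstract. The function name `layer_sequence_lengths`, the layer count, and the layer index `append_at` are hypothetical.

```python
def layer_sequence_lengths(seq_len, mask_rate, num_layers, append_at):
    """Tokens processed by each layer when [MASK]s join the sequence at layer `append_at`."""
    num_masked = round(seq_len * mask_rate)
    unmasked = seq_len - num_masked
    # Layers before `append_at` see only unmasked tokens; later layers see the full length.
    return [unmasked if layer < append_at else seq_len for layer in range(num_layers)]


# Hypothetical setting: 512 tokens, 50% masking, 12 layers, placeholders appended at layer 8.
lengths = layer_sequence_lengths(seq_len=512, mask_rate=0.5, num_layers=12, append_at=8)
print(lengths)
```

A higher masking rate shortens the sequence seen by the early layers further, which is consistent with the abstract's choice to raise the rate from 15% to 50%.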
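Insufficient training samples are a common problem in neural network applications. While data augmentation methods require at least a minimum number of samples, we propose a novel, rendering-based pipeline for synthesizing annotated datasets. Our method does not modify existing samples but synthesizes entirely new ones. The proposed rendering-based pipeline can generate and annotate synthetic and partly real image and video data in a fully automatic procedure. Moreover, the pipeline can aid the acquisition of real data. The pipeline is based on a rendering process that generates synthetic data. Partly real data bring the synthetic sequences closer to reality by incorporating real cameras during the acquisition process. The benefits of the proposed data generation pipeline, especially for machine learning scenarios with limited available training data, are demonstrated by an extensive experimental validation in the context of automatic license plate recognition. Compared with an OCR algorithm trained solely on a real dataset, the experiments show a reduction of the character error rate and miss rate from 73.74% and 100% to 14.11% and 41.27%, respectively. These improvements are achieved by training the algorithm solely on synthesized data. When real data are additionally incorporated, the error rates can be decreased further: the character error rate and miss rate can be reduced to 11.90% and 39.88%, respectively. All data used during the experiments, as well as the proposed rendering-based pipeline for automated data generation, are made publicly available (URL will be revealed upon publication).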
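In this paper, we propose a data augmentation framework for optical character recognition (OCR). The proposed framework is able to synthesize new viewing angles and illumination scenarios, effectively enriching any available OCR dataset. Its modular structure allows it to be modified to match individual user requirements. The framework makes it possible to conveniently scale the enlargement factor of the available dataset. Furthermore, the proposed method is not restricted to single-frame OCR but can also be applied to video OCR. We demonstrate the performance of our framework by augmenting a 15% subset of the common Brno Mobile OCR dataset. Our proposed framework is capable of improving the performance of OCR applications, especially for small datasets. Applying the proposed method, improvements of up to 2.79 percentage points in terms of character error rate (CER) and up to 7.88 percentage points in terms of word error rate (WER) are achieved on the subset. In particular, the recognition of challenging text lines can be improved: for this class, the CER may be decreased by up to 14.92 percentage points and the WER by up to 18.19 percentage points. Moreover, we are able to achieve smaller error rates when training on the 15% subset augmented with the proposed method than when training on the original non-augmented full dataset.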
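The character error rate used in this abstract is commonly computed as the Levenshtein (edit) distance between recognized and reference text, divided by the reference length. The sketch below follows that common definition and shows how an improvement is expressed in percentage points; whether the authors aggregate over lines in exactly this way is an assumption, and the example strings are made up rather than taken from the Brno Mobile OCR dataset.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance divided by total reference length, in percent."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    length = sum(len(r) for r in references)
    return 100.0 * errors / length


# Hypothetical recognizer outputs before and after training with augmented data.
refs = ["license plate", "recognition"]
baseline = ["1icense p1ate", "recogmition"]
augmented = ["license plate", "recogmition"]

cer_baseline = character_error_rate(refs, baseline)
cer_augmented = character_error_rate(refs, augmented)
gain_pp = cer_baseline - cer_augmented  # absolute difference, i.e. percentage points
print(cer_baseline, cer_augmented, gain_pp)
```

Reporting the difference in percentage points, as the abstract does, avoids the ambiguity of a relative reduction: a drop from 12.5% to about 4.2% is a gain of about 8.3 percentage points, not 8.3%.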
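Neural machine translation (NMT) is an open-vocabulary problem. As a result, dealing with words that do not occur during training, also known as out-of-vocabulary (OOV) words, has long been a fundamental challenge for NMT systems. The predominant method for tackling this problem is byte pair encoding (BPE), which splits words, including OOV words, into subword segments. BPE has achieved impressive results for a wide range of translation tasks in terms of automatic evaluation metrics. While it is often assumed that NMT systems using BPE are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured. In this paper, we study to what extent BPE succeeds in translating OOV words at the word level. We analyze the translation quality of OOV words based on word type, number of segments, cross-attention weights, and the frequency of segment n-grams in the training data. Our experiments show that while careful BPE settings seem to be fairly useful in translating OOV words across datasets, a considerable percentage of OOV words are translated incorrectly. Furthermore, we highlight the slightly higher effectiveness of BPE in translating OOV words in special cases, such as named entities and language pairs in which the languages are close to each other.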
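How BPE splits an unseen word into known subword segments can be sketched with a minimal implementation of the usual merge-learning procedure: repeatedly merge the most frequent adjacent symbol pair in the training vocabulary, then replay the learned merges on a new word. This is a toy illustration under stated assumptions, not the setup used in the paper: the word counts, the number of merges, the `</w>` end-of-word marker, and the function names are all chosen for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_counts, num_merges):
    """Learn an ordered list of merges from a {word: frequency} dictionary."""
    vocab = {tuple(word) + ("</w>",): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): count for symbols, count in vocab.items()}
    return merges


def segment(word, merges):
    """Split a word into subword segments by replaying the merges in learned order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical training vocabulary; "lowest" never occurs in it and is therefore OOV.
merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=10)
segments = segment("lowest", merges)
print(segments)
```

The OOV word is never mapped to an unknown token; it is covered by segments learned from other words. Whether those segments carry enough meaning for a correct translation is the question the abstract investigates.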
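Forensic license plate recognition (FLPR) remains an open challenge in legal contexts such as criminal investigations, where unreadable license plates (LPs) need to be deciphered from highly compressed and/or low-resolution footage, e.g., from surveillance cameras. In this work, we propose a side-informed Transformer architecture that embeds knowledge of the input compression level to improve recognition under strong compression. We show the effectiveness of Transformers for license plate recognition (LPR) on a low-quality real-world dataset. We also provide a synthetic dataset that includes strongly degraded, illegible LP images and analyze the impact of the knowledge embedding on it. The network outperforms existing FLPR methods and standard state-of-the-art image recognition models while requiring fewer parameters. For the most severely degraded images, we can improve recognition by up to 8.9%.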